A Novel MapReduce Lift Association Rule Mining Algorithm (MRLAR) for Big Data
Abstract
Big Data mining is an analytic process used to discover hidden knowledge and patterns in massive, complex, multi-dimensional datasets. The memory and CPU resources of a single processor are very limited, which makes algorithms run inefficiently at this scale. Recently, there has been renewed interest in using association rule mining (ARM) on Big Data to uncover relationships between seemingly unrelated items. However, traditional ARM techniques are unable to handle this volume of data. Therefore, there is a vital need for scalable, parallel ARM strategies based on Big Data approaches. This paper develops a novel MapReduce framework for an association rule algorithm based on the Lift interestingness measure (MRLAR), which can handle massive datasets with a large number of nodes. The experimental results show the efficiency of the proposed algorithm in measuring the correlations between itemsets by integrating MapReduce with the Lift interestingness measure (LIM) instead of depending on confidence.
Keywords: Big Data; Data Mining; Association Rule; MapReduce; Lift Interesting Measurement
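To make the contrast between lift and confidence concrete, the following is a minimal sketch, not the MRLAR algorithm itself, whose details the abstract does not give. It assumes the standard definitions, confidence(A -> B) = supp(A, B) / supp(A) and lift(A -> B) = supp(A, B) / (supp(A) * supp(B)), and simulates the map and reduce steps in memory rather than on a Hadoop cluster. The function names (`map_phase`, `reduce_phase`, `lift`, `confidence`) and the toy transactions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction):
    """Hypothetical mapper: emit (itemset, 1) for each item and each item pair."""
    items = sorted(set(transaction))
    for item in items:
        yield (item,), 1
    for pair in combinations(items, 2):
        yield pair, 1


def reduce_phase(pairs):
    """Hypothetical reducer: sum the counts emitted for each itemset key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def confidence(counts, a, b):
    """confidence(a -> b) = count(a, b) / count(a)."""
    pair = tuple(sorted((a, b)))
    return counts[pair] / counts[(a,)]


def lift(counts, n, a, b):
    """lift(a -> b) = supp(a, b) / (supp(a) * supp(b)), with supp = count / n."""
    pair = tuple(sorted((a, b)))
    supp_ab = counts[pair] / n
    supp_a = counts[(a,)] / n
    supp_b = counts[(b,)] / n
    return supp_ab / (supp_a * supp_b)


# Toy data (an assumption, not from the paper): 10 transactions,
# tea appears in 4, coffee in 9, both together in 3.
transactions = (
    [["tea", "coffee"]] * 3
    + [["tea"]]
    + [["coffee"]] * 6
)

counts = reduce_phase(kv for t in transactions for kv in map_phase(t))
n = len(transactions)

conf_value = confidence(counts, "tea", "coffee")
lift_value = lift(counts, n, "tea", "coffee")
print(f"confidence(tea -> coffee) = {conf_value:.3f}")
print(f"lift(tea -> coffee)       = {lift_value:.3f}")
```

In this toy data the rule tea -> coffee has a confidence of 0.75, which looks strong, yet its lift is below 1, meaning tea buyers are slightly less likely to buy coffee than customers overall. This is the kind of misleading rule that a lift-based measure is meant to filter out.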
Similar resources
Data Mining Using Clouds: An Experimental Implementation of Apriori over MapReduce
Cloud computing has become a viable mainstream solution for data processing, storage and distribution. It promises on-demand, scalable, pay-as-you-go compute and storage capacity. To analyze “big data” on clouds, it is very important to research data mining strategies based on the cloud computing paradigm from both theoretical and practical views. For this purpose, we study a strategy of data minin...
Mining Perfectly Rare Itemsets on Big Data: An Approach Based on Apriori-Inverse and MapReduce
Association rule mining is one of the most common data mining techniques used to identify and describe interesting relationships between patterns in large quantities of data. Whereas much research has focused on extracting patterns that appear frequently in order to obtain general information, in some scenarios it could also be interesting to extract unexpected phenomena. Rare as...
Input Split Frequent Pattern Tree Using MapReduce Paradigm in Hadoop
Big data has attracted attention in the information industry and in society in recent years, due to the wide availability of huge amounts of data on the Internet, and the complexity of that data is growing every day. Hence, distributed data mining algorithms have been developed to exploit big data in a way that is adaptable to current technology. Since there exist some limitations in traditional algorithms for dealing with the m...
A Novel Method for Selecting the Supplier Based on Association Rule Mining
One of the important problems in supply chain management is supplier selection. A company holds massive data from various departments, so extracting knowledge from that data is complicated. Many researchers have addressed this problem with methods such as fuzzy set theory, goal programming, multi-objective programming, linear programming, mixed integer programming, analyti...
An Algorithm for Mining Frequent Itemsets from Library Big Data
Frequent itemset mining plays an important part in college library data analysis. Because library databases contain a lot of redundant data, the mining process may generate intra-property frequent itemsets, which significantly hinders its efficiency. To address this issue, we propose an improved FP-Growth algorithm, which we call RFP-Growth, to avoid generating intra-property frequent itemsets, ...